Due to their importance, there is plethora of excellent algorithms for symmetric tridiagonal matrices.
For more details, see I. Slapničar, Symmetric Matrix Eigenvalue Techniques and the references therein.
Prerequisites
The reader should be familiar with concepts of eigenvalues and eigenvectors, related perturbation theory, and algorithms.
Competences
The reader should be able to apply adequate algorithm to a given symmetric tridiagonal matrix and to assess the speed of the algorithm and the accuracy of the solution.
The bisection method is convenient if only part of the spectrum is needed. If the eigenvectors are needed, as well, they can be efficiently computed by the inverse iteration method.
$A$ is a real symmetric $n\times n$ matrix and $T$ is a real symmetric tridiagonal $n\times n$ matrix.
Application of Sylvester's Theorem. Let $\alpha,\beta\in\mathbb{R}$ with $\alpha<\beta$. The number of eigenvalues of $A$ in the interval $[\alpha,\beta)$ is equal to $\nu (A- \beta I)-\nu(A-\alpha I)$. By systematically choosing the intervals $[\alpha,\beta)$, the bisection method pinpoints each eigenvalue of $A$ to any desired accuracy.
The factorization $T-\mu I=LDL^T$, where $D=\mathop{\mathrm{diag}}(d_1,\dots,d_n)$ and $L$ is the unit lower bidiagonal matrix, is computed as: \begin{align*} & d_{1}=T_{11}-\mu, \quad d_{i}=(T_{ii}-\mu)- \frac{T_{i,i-1}^2}{d_{i-1}}, \quad i=2,\ldots n, \\ & L_{i+1,i}=\frac{T_{i+1,i}}{d_{i}}, \quad i=1,\ldots,n-1. \end{align*} Since the matrices $T$ and $D$ have the same inertia, this recursion enables an efficient implementation of the bisection method for $T$.
The factorization from Fact 2 is essentially Gaussian elimination without pivoting. Nevertheless, if $d_i\neq 0$ for all $i$, the above recursion is very stable. Even when $d_{i-1}=0$ for some $i$, if the IEEE arithmetic is used, the computation will continue and the inertia will be computed correctly. Namely, in that case, we would have $d_i=-\infty$, $L_{i+1,i}=0$, and $d_{i+1}=T_{i+1.i+1}-\mu$.
Computing one eigenvalue of $T$ by using the recursion from Fact 2 and bisection requires $O(n)$ operations. The corresponding eigenvector is computed by inverse iteration. The convergence is very fast, so the cost of computing each eigenvector is also $O(n)$ operations. Therefore, the overall cost for computing all eigenvalues and eigenvectors is $O(n^2)$ operations.
Both, bisection and inverse iteration are highly parallel since each eigenvalue and eigenvector can be computed independently.
If some of the eigenvalues are too close, the corresponding eigenvectors computed by inverse iteration may not be sufficiently orthogonal. In this case, it is necessary to orthogonalize these eigenvectors (for example, by the modified Gram--Schmidt procedure). If the number of close eigenvalues is too large, the overall operation count can increase to $O(n^3)$.
The EVD computed by bisection and inverse iteration satisfies the error bounds from previous notebook.
The bisection method for tridiagonal matrices is implemented in the LAPACK subroutine DSTEBZ. This routine can compute all eigenvalues in a given interval or the eigenvalues from $\lambda_l$ to $\lambda_k$, where $l<k$, and the eigenvalues are ordered from smallest to largest. Inverse iteration (with reorthogonalization) is implemented in the LAPACK subroutine DSTEIN.
In [22]:
import Random
Random.seed!(423)
using LinearAlgebra
n=6
T=SymTridiagonal(rand(n),rand(n-1))
Out[22]:
In [23]:
λ,U=eigen(T)
Out[23]:
In [24]:
# Fact 2
function myLDLt(T::SymTridiagonal{S},μ::S) where S<:Real
n=length(T.dv)
D=Diagonal(Vector{S}(undef,n))
L=Bidiagonal(fill(one(S),n),Vector{S}(undef,n-1),'L')
D.diag[1]=T.dv[1]-μ
for i=2:n
D.diag[i]=(T.dv[i]-μ)-T.ev[i-1]^2/D.diag[i-1]
end
for i=1:n-1
L.ev[i]=T.ev[i]/D.diag[i]
end
return D,L
end
Out[24]:
In [25]:
σ=1.0
Out[25]:
In [26]:
D,L=myLDLt(T,σ)
Out[26]:
In [28]:
L
Out[28]:
In [29]:
Matrix(L*D*transpose(L))-(T-σ*I)
Out[29]:
In [30]:
# Inertias are the same
[λ.-σ D.diag]
Out[30]:
In [31]:
# Fact 8
methods(LAPACK.stebz!)
Out[31]:
In [32]:
?LAPACK.stebz!
Out[32]:
In [36]:
λ₁, rest=LAPACK.stebz!('A','E',1.0,1.0,1,1,2*eps(),T.dv,T.ev)
Out[36]:
In [37]:
λ-λ₁
Out[37]:
In [39]:
U₁=LAPACK.stein!(T.dv,T.ev,λ₁)
Out[39]:
In [40]:
λ
Out[40]:
In [41]:
# Residual
norm(T*U₁-U₁*Diagonal(λ₁))
Out[41]:
In [42]:
U₁'*U₁
Out[42]:
In [43]:
# Let us compute just some eigenvalues - from 2nd to 4th
λ₂,rest=LAPACK.stebz!('V','E',0.0,1.0,2,4,2*eps(),T.dv,T.ev)
Out[43]:
In [44]:
# And the corresponding eigenvectors
U₂=LAPACK.stein!(T.dv,T.ev,λ₂)
Out[44]:
This is currently the fastest method for computing the EVD of a real symmetric tridiagonal matrix $T$. It is based on splitting the given tridiagonal matrix into two matrices, then computing the EVDs of the smaller matrices and computing the final EVD from the two EVDs.
$T$ is a real symmetric tridiagonal matrix of order $n$ and $T=U\Lambda U^T$ is its EVD.
Let $T$ be partitioned as $$ T=\begin{bmatrix} T_1 & \alpha_k e_k e_1^T \\ \alpha_k e_1 e_k^T & T_2 \end{bmatrix}. $$ We assume that $T$ is unreduced, that is, $\alpha_i\neq 0$ for all $i$. Further, we assume that $\alpha_i>0$ for all $i$, which can be easily be attained by diagonal similarity with a diagonal matrix of signs. Let $$ \hat T_1=T_1- \alpha_k e_k e_k^T,\qquad \hat T_2=T_2- \alpha_k e_1 e_1^T. $$ In other words, $\hat T_1$ is equal to $T_1$ except that $T_{kk}$ is replaced by $T_{kk}-\alpha_k$, and $\hat T_2$ is equal to $T_2$ except that $T_{k+1,k+1}$ is replaced by $T_{k+1,k+1}-\alpha_k$. Let $\hat T_i= \hat U_i \hat \Lambda_i \hat U_i^T$, $i=1,2$, be the respective EVDs and let $v=\begin{bmatrix} \hat U_1^T e_k \\ \hat U_2^T e_1 \end{bmatrix} $ ($v$ consists of the last column of $\hat U_1^T$ and the first column of $\hat U_2^T$). Set $\hat U=\hat U_1\oplus \hat U_2$ and $\hat \Lambda=\hat \Lambda_1 \oplus \hat \Lambda_2$. Then $$ T=\begin{bmatrix}\hat U_1 & \\ & \hat U_2 \end{bmatrix} \left[\begin{bmatrix} \hat \Lambda_1 & \\ & \hat \Lambda_2 \end{bmatrix} + \alpha_k v v^T\right] \begin{bmatrix} \hat U_1^T & \\ & \hat U_2^T \end{bmatrix}= \hat U (\hat \Lambda + \alpha_k v v^T) \hat U^T. $$ If $\hat \Lambda + \alpha_k v v^T=X\Lambda X^T$ is the EVD of the rank-one modification of the diagonal matrix $\hat \Lambda$, then $T=U\Lambda U^T$, where $U=\hat U X$ is the EVD of $T$. Thus, the original tridiagonal eigenvalue problem is reduced to two smaller tridiagonal eigenvalue problems and one eigenvalue problem for the diagonal-plus-rank-one matrix.
If all $\hat \lambda_i$ are different, then the eigenvalues $\lambda_i$ of $\hat \Lambda + \alpha_k v v^T$ are solutions of the so-called secular equation, $$ 1+\alpha_k \sum_{i=1}^n \frac{v_i^2}{\hat \lambda_i-\lambda}=0. $$ The eigenvalues can be computed by bisection, or by some faster zero finder of the Newton type, and they need to be computed as accurately as possible. The corresponding eigenvectors are $$ x_i=(\hat \Lambda-\lambda_i I)^{-1}v. $$
Each $\lambda_i$ and $x_i$ is computed in in $O(n)$ operations, respectively, so the overall computational cost for computing the EVD of $\hat \Lambda + \alpha_k v v^T$ is $O(n^2)$ operations.
The method can be implemented so that the accuracy of the computed EVD is given by the bound from the previous notebook.
Tridiagonal Divide-and-conquer method is implemented in the LAPACK subroutine DSTEDC. This routine can compute just the eigenvalues or both, eigenvalues and eigenvectors.
The file lapack.jl contains wrappers for a selection of LAPACK routines needed in the current Julia Base
. However, all LAPACK routines are in the compiled library, so additional wrappers can be easily written. Notice that arrays are passed directly and scalars as passed as pointers. The wrapper for DSTEDC
, similar to the ones from the file
lapack.jl
follows.
In [45]:
T
Out[45]:
In [72]:
T₁=T[1:3,1:3]
T₂=T[4:6,4:6]
α=T[3,4]
T₁[3,3]-=α
T₂[1,1]-=α
D₁,U₁=eigen(Matrix(T₁))
D₂,U₂=eigen(Matrix(T₂))
x=zeros(6)
x[3]=sqrt(α)
x[4]=sqrt(α)
T-[T₁ zeros(3,3);zeros(3,3) T₂]-x*x'
U=[U₁ zeros(3,3);zeros(3,3) U₂]
Out[72]:
In [73]:
U'*T*U
Out[73]:
In [77]:
v=U'*x;
In [76]:
Diagonal([D₁;D₂])+v*v'
Out[76]:
In [ ]:
LAPACK.
In [80]:
# Part of the preamble of lapack.jl
const liblapack = Base.liblapack_name
import LinearAlgebra.BLAS.@blasfunc
# import ..LinAlg: BlasFloat, Char, BlasInt, LAPACKException,
# DimensionMismatch, SingularException, PosDefException, chkstride1, chksquare
import LinearAlgebra.BlasInt
function chklapackerror(ret::BlasInt)
if ret == 0
return
elseif ret < 0
throw(ArgumentError("invalid argument #$(-ret) to LAPACK call"))
else # ret > 0
throw(LAPACKException(ret))
end
end
Out[80]:
In [81]:
for (stedc, elty) in
((:dstedc_,:Float64),
(:sstedc_,:Float32))
@eval begin
"""
COMPZ is CHARACTER*1
= 'N': Compute eigenvalues only.
= 'I': Compute eigenvectors of tridiagonal matrix also.
= 'V': Compute eigenvectors of original dense symmetric
matrix also. On entry, Z contains the orthogonal
matrix used to reduce the original matrix to
tridiagonal form.
"""
function stedc!(compz::Char, dv::Vector{$elty}, ev::Vector{$elty},
Z::Array{$elty})
n = length(dv)
ldz=n
if length(ev) != n - 1
throw(DimensionMismatch("ev has length $(length(ev))
but needs one less than dv's length, $n)"))
end
w = deepcopy(dv)
u = deepcopy(ev)
lwork=5*n^2
work = Array{$elty}(undef,lwork)
liwork=6+6*n+5*n*round(Int,ceil(log(n)/log(2)))
iwork = Array{BlasInt}(undef,liwork)
# info = Array{BlasInt}(undef,5)
info = Ref{BlasInt}()
ccall((@blasfunc($stedc), liblapack), Cvoid,
(Ref{UInt8}, Ref{BlasInt}, Ptr{$elty},
Ptr{$elty}, Ptr{$elty}, Ref{BlasInt}, Ptr{$elty},
Ref{BlasInt}, Ptr{BlasInt}, Ref{BlasInt},
Ptr{BlasInt}), compz, n, w, u, Z, ldz, work,
lwork, iwork, liwork, info)
chklapackerror(info[])
return w,Z
end
end
end
In [83]:
μ,Q=stedc!('I',T.dv,T.ev,Matrix{Float64}(I,n,n))
Out[83]:
In [84]:
λ-μ
Out[84]:
In [85]:
Q'*T*Q
Out[85]:
In [86]:
# Timings
n=3000
Tbig=SymTridiagonal(rand(n),rand(n-1));
In [90]:
@time eigen(Tbig);
@time stedc!('I',Tbig.dv,Tbig.ev,Matrix{Float64}(I,n,n));
In [93]:
# Simple version
function DivideConquer(T::SymTridiagonal{S}) where S
n=length(T.dv)
U=Matrix{S}(undef,n,n)
Λ=Vector{S}(undef,n)
if n==1
Λ=T.dv[1]
U=[one(S)]
else
k=div(n,2)
T₁=SymTridiagonal(T.dv[1:k],T.ev[1:k-1])
T₁.dv[k]-=T.ev[k]
T₂=SymTridiagonal(T.dv[k+1:n],T.ev[k+1:n-1])
T₂.dv[1]-=T.ev[k]
Λ₁,U₁=DivideConquer(T₁)
Λ₂,U₂=DivideConquer(T₂)
v=vcat(transpose(U₁)[:,k],transpose(U₂)[:,1])
D=Diagonal(vcat(Λ₁,Λ₂))
Λ=eigvals(D+T.ev[k]*v*transpose(v))
for i=1:n
U[:,i]=(D-Λ[i]*I)\v
normalize!(view(U,:,i))
end
U[1:k,:]=U₁*U[1:k,:]
U[k+1:n,:]=U₂*U[k+1:n,:]
end
return Λ, U
end
Out[93]:
In [94]:
Λ,U=DivideConquer(T)
Out[94]:
In [95]:
[Λ eigvals(T)]
Out[95]:
In [96]:
norm(T*U-U*Diagonal(Λ))
Out[96]:
The method of Multiple Relatively Robust Representations
The computation of the tridiagonal EVD which satisfies the error standard error bounds such that the eigenvectors are orthogonal to working precision, all in $O(n^2)$ operations, has been the holy grail of numerical linear algebra for a long time. The method of Multiple Relatively Robust Representations does the job, except in some exceptional cases. The key idea is to implement inverse iteration more carefully. The practical algorithm is quite elaborate and the reader is advised to consider references.
The MRRR method is implemented in the LAPACK subroutine DSTEGR. This routine can compute just the eigenvalues, or both eigenvalues and eigenvectors.
In [98]:
methods(LAPACK.stegr!)
Out[98]:
In [99]:
?LAPACK.stegr!
Out[99]:
In [101]:
LAPACK.stegr!('V',T.dv,T.ev)
Out[101]:
In [102]:
λ
Out[102]:
In [104]:
# Timings
@time LAPACK.stegr!('V',Tbig.dv,Tbig.ev);
In [105]:
n=1500
Tbig=SymTridiagonal(rand(n),rand(n-1));
In [107]:
@time LAPACK.stegr!('V',Tbig.dv,Tbig.ev);
In [13]:
using Plots
k=20
n=20
E=Array{ComplexF64}(undef,n,k)
# Unsymmetric random uniform distribution
for i=1:k
A=randn(n,n)
E[:,i]=eigvals(A)
end
In [14]:
scatter(E,legend=false)
Out[14]:
In [20]:
k=50
n=20
E=Array{Float64}(undef,n,k)
# Symmetric random uniform distribution
for i=1:k
A=Symmetric(randn(n,n))
E[:,i]=eigvals(A)
end
In [21]:
scatter(E,legend=false)
Out[21]:
In [ ]: